CEIL: Generalized Contextual Imitation Learning
Inspired by the formulation of hindsight information matching, we derive CEIL by explicitly learning a hindsight embedding function together with a contextual policy using the hindsight embeddings. To achieve the expert matching objective for IL, we advocate for optimizing a contextual variable such that it biases the contextual policy towards mimicking expert behaviors. Beyond the typical learning from demonstrations (LfD) setting, CEIL is a generalist that can be effectively applied to multiple settings, including: 1) learning from observations (LfO), 2) offline IL, 3) cross-domain IL (mismatched experts), and 4) one-shot IL. Compared to prior state-of-the-art baselines, CEIL is more sample-efficient in most online IL tasks and achieves better or competitive performance on offline tasks.
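The abstract's core mechanism is to first fit a hindsight embedding function and a contextual policy on offline trajectories, and then optimize a single contextual variable so that conditioning on it steers the policy toward the expert. A minimal sketch of that second step follows; the module names (HindsightEncoder, ContextualPolicy) and the embedding-matching surrogate objective are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of CEIL's expert-matching step, assuming a learned
# hindsight embedding f(trajectory) -> z and a contextual policy pi(a | s, z).
import torch
import torch.nn as nn

class HindsightEncoder(nn.Module):
    """Embeds a flattened trajectory into a latent context z (illustrative)."""
    def __init__(self, traj_dim: int, z_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(traj_dim, 128), nn.ReLU(),
                                 nn.Linear(128, z_dim))

    def forward(self, traj: torch.Tensor) -> torch.Tensor:
        return self.net(traj)

class ContextualPolicy(nn.Module):
    """Maps (state, context) to an action (illustrative architecture)."""
    def __init__(self, obs_dim: int, z_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + z_dim, 128), nn.ReLU(),
                                 nn.Linear(128, act_dim))

    def forward(self, obs: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, z], dim=-1))

def infer_expert_context(encoder: HindsightEncoder,
                         expert_trajs: torch.Tensor,
                         steps: int = 500, lr: float = 1e-2) -> torch.Tensor:
    """Optimize a contextual variable z* so the policy conditioned on z*
    mimics the expert. Here the expert-matching objective is simplified to
    matching the mean embedding of the expert demonstrations."""
    z_dim = encoder.net[-1].out_features
    z_star = torch.zeros(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z_star], lr=lr)
    with torch.no_grad():
        target = encoder(expert_trajs).mean(dim=0, keepdim=True)
    for _ in range(steps):
        loss = ((z_star - target) ** 2).sum()  # expert-matching surrogate
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z_star.detach()
```

At deployment, the policy is simply conditioned on the inferred context, pi(a | s, z*), which is what lets the same machinery cover LfO, offline, cross-domain, and one-shot variants: only the source of the expert embeddings changes.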
DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation
Liu, Jinxin, Guo, Xinghong, Zhuang, Zifeng, Wang, Donglin
In this paper, we propose a novel approach called DIffusion-guided DIversity (DIDI) for offline behavioral generation. The goal of DIDI is to learn a diverse set of skills from a mixture of label-free offline data. We achieve this by leveraging diffusion probabilistic models as priors to guide the learning process and regularize the policy. By optimizing a joint objective that incorporates diversity and diffusion-guided regularization, we encourage the emergence of diverse behaviors while maintaining similarity to the offline data. Experimental results in four decision-making domains (Push, Kitchen, Humanoid, and D4RL tasks) show that DIDI is effective in discovering diverse and discriminative skills. We also introduce skill stitching and skill interpolation, which highlight the generalist nature of the learned skill space. Further, by incorporating an extrinsic reward function, DIDI enables reward-guided behavior generation, facilitating the learning of diverse and optimal behaviors from sub-optimal data.
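The joint objective described above combines two terms: a diversity term that makes skills identifiable, and a diffusion-guided regularizer that keeps generated behavior close to the offline data. The sketch below is a hypothetical rendering of that trade-off; the discriminator-based diversity surrogate (DIAYN-style) and the `diffusion_prior.sample` API are assumptions, not the paper's exact formulation.

```python
# Illustrative DIDI-style joint objective, assuming a skill-conditioned policy
# pi(a | s, z), a skill discriminator q(z | s), and a diffusion prior
# pretrained on the label-free offline actions.
import torch
import torch.nn.functional as F

def didi_loss(policy, discriminator, diffusion_prior,
              obs: torch.Tensor, skill: torch.Tensor,
              beta: float = 1.0) -> torch.Tensor:
    # Diversity term: reward skills that are identifiable from the states
    # they visit, i.e. maximize log q(z | s) (a DIAYN-style surrogate).
    logits = discriminator(obs)                 # (batch, n_skills)
    diversity = F.cross_entropy(logits, skill)  # minimizes -log q(z | s)

    # Diffusion-guided regularization: keep the policy's action close to a
    # sample from the diffusion prior fit on the offline data, so emergent
    # behaviors stay in-distribution. `sample` is an assumed interface; the
    # policy is assumed to embed the integer skill index internally.
    action = policy(obs, skill)
    with torch.no_grad():
        prior_action = diffusion_prior.sample(obs)
    regularization = F.mse_loss(action, prior_action)

    return diversity + beta * regularization
```

The weight `beta` controls the balance the abstract describes: a larger value pins behavior to the offline distribution, a smaller one lets the diversity term dominate.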
Contextual Bandits in a Survey Experiment on Charitable Giving: Within-Experiment Outcomes versus Policy Learning
Athey, Susan, Byambadalai, Undral, Hadad, Vitor, Krishnamurthy, Sanath Kumar, Leung, Weiwen, Williams, Joseph Jay
We design and implement an adaptive experiment (a ``contextual bandit'') to learn a targeted treatment assignment policy, where the goal is to use a participant's survey responses to determine which charity to expose them to in a donation solicitation. The design balances two competing objectives: optimizing the outcomes for the subjects in the experiment (``cumulative regret minimization'') and gathering data that will be most useful for policy learning, that is, for learning an assignment rule that will maximize welfare if used after the experiment (``simple regret minimization''). We evaluate alternative experimental designs by collecting pilot data and then conducting a simulation study. Next, we implement our selected algorithm. Finally, we perform a second simulation study anchored to the collected data that evaluates the benefits of the algorithm we chose. Our first result is that the value of a learned policy in this setting is higher when data is collected via uniform randomization rather than adaptively using standard cumulative regret minimization or policy learning algorithms. We propose a simple heuristic for adaptive experimentation that improves upon uniform randomization from the perspective of policy learning at the expense of increasing cumulative regret relative to alternative bandit algorithms. The heuristic modifies an existing contextual bandit algorithm by (i) imposing a lower bound on assignment probabilities that decays slowly, so that no arm is discarded too quickly, and (ii) after adaptively collecting data, restricting policy learning to select from arms where sufficient data has been gathered.
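The heuristic's two modifications are simple to state mechanically, so a short sketch may help; the decay schedule and the minimum-count threshold below are hypothetical choices for illustration, not the paper's calibrated values.

```python
# Illustrative sketch of the two modifications, assuming a generic contextual
# bandit that outputs per-arm assignment probabilities each round.
import numpy as np

def floor_probabilities(probs: np.ndarray, t: int, n_arms: int) -> np.ndarray:
    """(i) Impose a slowly decaying lower bound on assignment probabilities
    so that no arm is discarded too quickly."""
    floor = 1.0 / (n_arms * np.sqrt(t + 1))  # hypothetical decay schedule
    probs = np.maximum(probs, floor)
    return probs / probs.sum()               # renormalize to a distribution

def eligible_arms(arm_counts: np.ndarray, min_count: int = 100) -> np.ndarray:
    """(ii) After adaptive data collection, restrict policy learning to arms
    where sufficient data has been gathered."""
    return np.flatnonzero(arm_counts >= min_count)

# Usage: wrap the base bandit's probabilities each round, then fit the final
# assignment policy only over eligible_arms(arm_counts).
raw = np.array([0.90, 0.05, 0.05])
print(floor_probabilities(raw, t=50, n_arms=3))
```

The floor trades some cumulative regret for exploration that supports post-experiment policy learning, which is exactly the tension the abstract describes.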